Appellants' Post Appeal Amendment 
S/N: 09/848,430 

AMENDMENTS TO THE CLAIMS: 

1. (Currently amended) A method of converting a document corpus containing an ordered
plurality of documents into a compact representation in memory of occurrence data, said 
method comprising: 

developing a first vector for said entire document corpus, said first vector being a 
listing of integers corresponding to terms in said documents such that each said document 
in said document corpus is sequentially represented in said listing; and

developing a second vector for said entire document corpus, said second vector
indicating the location of each said document's representation in said first vector.
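
Illustrative note (not part of the claims): the two vectors recited in claim 1 can be pictured with a minimal sketch. It assumes each document has already been reduced to a list of integer term identifiers, and it assumes the "location" in the second vector is a start offset, with one extra end-of-corpus entry appended for convenience. The names `build_vectors`, `term_ids`, `doc_offsets`, and `document_slice` are hypothetical and do not come from the application.

```python
from typing import List, Tuple


def build_vectors(docs: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Concatenate per-document term integers into one listing.

    Returns (term_ids, doc_offsets). doc_offsets[i] is the position in
    term_ids where document i begins; a final entry marks the end of the
    corpus (an assumption made here for easy slicing).
    """
    term_ids: List[int] = []
    doc_offsets: List[int] = []
    for doc in docs:
        doc_offsets.append(len(term_ids))  # where this document starts
        term_ids.extend(doc)               # documents appear in corpus order
    doc_offsets.append(len(term_ids))      # end-of-corpus sentinel
    return term_ids, doc_offsets


def document_slice(term_ids: List[int], doc_offsets: List[int], i: int) -> List[int]:
    """Recover the term integers of document i from the two vectors."""
    return term_ids[doc_offsets[i]:doc_offsets[i + 1]]
```

Under these assumptions, recovering one document takes two lookups in the second vector and one contiguous slice of the first.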

2. (Currently amended) The method of claim 1, further comprising:

developing a third vector for said entire document corpus, said third vector 
comprising a sequential listing of floating point multipliers, each said floating point 
multiplier representing a document normalization factor. 

3. (Currently amended) The method of claim 1, further comprising:

rearranging, in said first vector, an order of said unique integers within the data for 
each said document so that all identical unique integers are adjacent. 
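
Illustrative note (not part of the claims): one way to make identical integers adjacent within each document's span is to sort that span, as sketched below. Sorting is an assumption; any grouping that places equal integers side by side would fit the claim language. The sketch also assumes the second vector holds start offsets plus a final end-of-corpus entry. The name `sort_within_documents` is hypothetical.

```python
from typing import List


def sort_within_documents(term_ids: List[int], doc_offsets: List[int]) -> List[int]:
    """Return a copy of term_ids in which each document's span is sorted.

    Sorting keeps every document inside its own span, so the offsets in
    doc_offsets remain valid, while equal integers end up adjacent.
    """
    out = list(term_ids)
    # Pair each start offset with the next one to get [start, end) spans.
    for start, end in zip(doc_offsets, doc_offsets[1:]):
        out[start:end] = sorted(out[start:end])
    return out
```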

4. (Original) The method of claim 2, wherein said normalization factor is calculated as:

$NF = 1/\left(\sum_i x_i^2\right)^{1/2}$, where $x_i$ is the number of occurrences of a specific term in said
document, so that NF represents the reciprocal of the square root of the sum of squares of 
all term occurrences in said document. 
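
Illustrative note (not part of the claims): the normalization factor described above, the reciprocal of the square root of the sum of squared term counts, can be computed as in the following sketch. The function name `normalization_factor` is hypothetical, and returning 0.0 for an empty document is an assumption made only to avoid division by zero.

```python
import math
from collections import Counter
from typing import List


def normalization_factor(doc_term_ids: List[int]) -> float:
    """Compute NF = 1 / sqrt(sum of x_i squared) for one document.

    x_i is the number of times term i occurs in the document.
    """
    counts = Counter(doc_term_ids)                    # x_i per distinct term
    sum_of_squares = sum(c * c for c in counts.values())
    if sum_of_squares == 0:
        return 0.0                                    # assumed value for an empty document
    return 1.0 / math.sqrt(sum_of_squares)
```

For a document with term integers 3, 1, 3 the counts are 2 and 1, so the sum of squares is 5 and NF is 1/sqrt(5).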

5. (Currently amended) A method of converting, organizing, and representing in a 
computer memory a document corpus containing an ordered plurality of documents, said 
method comprising: 

for said document corpus, taking in sequence each said ordered document and 
developing a first uninterrupted listing of integers to correspond to an occurrence of terms 
in the document corpus; and

developing a second uninterrupted listing for said entire document corpus, said
second uninterrupted listing containing, in sequence, the location of each corresponding
document in said first uninterrupted listing.

6. (Currently amended) The method of claim 5, further comprising:

developing a third uninterrupted listing for said entire document corpus, said third 
uninterrupted listing containing a sequential listing of floating point multipliers, each said 
floating point multiplier representing a document normalization factor for a corresponding 
document in said document corpus. 

7. (Currently amended) The method of claim 5, further comprising:

for each said document in said document corpus, rearranging said unique integers 
so that any identical integers are adjacent. 

8. (Original) The method of claim 6, wherein said normalization factor is calculated as: 

$NF = 1/\left(\sum_i x_i^2\right)^{1/2}$, where $x_i$ is the number of occurrences of a specific term in said
document, so that NF represents the reciprocal of the square root of the sum of squares of 
all term occurrences in said document. 

9. (Currently amended) An apparatus for organizing and representing in a computer 
memory a document corpus containing an ordered plurality of documents, said apparatus 
comprising: 

an integer determining module receiving in sequence each said ordered document 
of said document corpus and developing a first uninterrupted listing of unique integers to 
correspond to an occurrence of terms in the document corpus; and

a locator module developing a second uninterrupted listing for said entire document
corpus, said second uninterrupted listing containing, in sequence, the location of each
corresponding document in said first uninterrupted listing.

10. (Original) The apparatus of claim 9, further comprising: 

a normalizer developing a third uninterrupted listing for said entire document 
corpus, containing a sequential listing of floating point multipliers, each said floating point 
multiplier representing a document normalization factor for a corresponding document in
said document corpus. 

11. (Original) The apparatus of claim 9, further comprising: 

a rearranger rearranging said unique integers so that any identical integers for each 
said document in said document corpus are adjacent. 

12. (Original) The apparatus of claim 10, wherein said normalizer calculates said 
normalization factor as: 

$NF = 1/\left(\sum_i x_i^2\right)^{1/2}$, where $x_i$ is the number of occurrences of a specific term in said
document, so that NF represents the reciprocal of the square root of the sum of squares of 
all term occurrences in said document. 

13. (Currently amended) A signal-bearing storage medium tangibly embodying a program 
of machine-readable instructions executable by a digital processing apparatus to perform a 
method to organize and represent in a computer memory a document corpus containing an 
ordered plurality of documents, said method comprising: 

developing a first uninterrupted listing of unique integers to correspond to the 
occurrence of terms in the document corpus; and

developing a second uninterrupted listing for said entire document corpus, said
second uninterrupted listing containing, in sequence, the location of each corresponding
document in said first uninterrupted listing.

14. (Currently amended) The signal-bearing storage medium of claim 13, wherein said
method further comprises: 

developing a third uninterrupted listing for said entire document corpus, containing 
a sequential listing of floating point multipliers, each said floating point multiplier 
representing a document normalization factor for a corresponding document in said 
document corpus. 

15. (Previously presented) A data converter for organizing and representing in a 
computer memory a document corpus containing an ordered plurality of documents, for 
use by a data mining applications program requiring occurrence-of-terms data, said 
representation to be based on terms in a dictionary developed for said document corpus and 
wherein each said term in said dictionary has associated therewith a corresponding unique 
integer, said data converter comprising: 

means for developing a first uninterrupted listing of said unique integers to 
correspond to the occurrence of dictionary terms in the document corpus; and

means for developing a second uninterrupted listing for said entire document 
corpus containing in sequence the location of each corresponding document in said first 
uninterrupted listing, wherein said first listing and said second listing are provided as input 
data for said data mining applications program. 

16. (Original) The data converter of claim 15, further comprising:

means for developing a third uninterrupted listing for said entire document corpus, 
containing a sequential listing of floating point multipliers, each said floating point 
multiplier representing a document normalization factor for a corresponding document in 
said document corpus. 

17. (Original) The data converter of claim 15, further comprising: 

means for rearranging said unique integers so that any identical integers for each 
said document in said document corpus are adjacent. 

18. (Previously presented) The method of claim 1, further comprising: 

developing a dictionary comprising said terms contained in said document corpus; and

associating, with each said dictionary term, an integer to be uniquely corresponding 
to said dictionary term, said uniquely corresponding integers being said integers 
comprising said first vector. 
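
Illustrative note (not part of the claims): the dictionary step can be sketched as a mapping from each distinct term to its own integer. The sketch assumes documents are already tokenized into term strings and assumes integers are assigned in order of first appearance; the claim requires only that each integer correspond uniquely to one term. The names `build_dictionary` and `encode` are hypothetical.

```python
from typing import Dict, List


def build_dictionary(docs_tokens: List[List[str]]) -> Dict[str, int]:
    """Assign each distinct term a unique integer (first-seen order assumed)."""
    dictionary: Dict[str, int] = {}
    for tokens in docs_tokens:
        for term in tokens:
            if term not in dictionary:
                dictionary[term] = len(dictionary)  # next unused integer
    return dictionary


def encode(docs_tokens: List[List[str]], dictionary: Dict[str, int]) -> List[List[int]]:
    """Replace every term with its dictionary integer, document by document."""
    return [[dictionary[term] for term in tokens] for tokens in docs_tokens]
```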

19. (Canceled) 

20. (Previously presented) The method of claim 5, further comprising: 

developing a dictionary comprising terms contained in said document corpus; and 



Docket ARC920000023US1

associating, with each said dictionary term, an integer to be uniquely corresponding 
to said dictionary term, said uniquely corresponding integers used in said first 
uninterrupted listing. 

21. (Canceled) 

22. (Previously presented) The apparatus of claim 9, further comprising: 

a dictionary developing module to develop a dictionary of terms contained in said 
document corpus, each said term being associated with a corresponding unique integer. 

23. (Canceled) 

24. (Currently amended) The signal-bearing storage medium of claim 13, wherein said 
method further comprises: 

developing a dictionary comprising terms contained in said document corpus; and

associating, with each said dictionary term, an integer to be uniquely corresponding
to said dictionary term, said uniquely corresponding integers used in said first
uninterrupted listing.

25. (Canceled) 